The Honorable William V. Roth
The  Honorable  Charles  E.  Grassley 
United  States  Senate 

The Department of Defense (DOD) has proposed that the practices and
policies of the Office of the Director of Operational Test and Evaluation
(DOT&E) be modified to reduce the time and cost of developing and fielding
new weapon systems. To help focus deliberations on DOD’s proposal, you
asked us to review DOT&E’s operations and organizational structure for
overseeing operational testing. Specifically, you asked us to assess
(1) DOT&E’s efforts and their impact on the quality of operational testing
and evaluation^1 in DOD and (2) the strengths and weaknesses of the current
organizational framework in DOD for operational testing. As part of our
review, we conducted 13 case studies of the testing of individual weapon
systems. (Our scope and methodology are described in app. I, and brief
descriptions of the 13 weapon systems are provided in app. II.)


Background 


In 1983, Congress established DOT&E to coordinate, monitor, and evaluate
operational testing of major weapon systems.^2 As part of the Office of the
Secretary of Defense (OSD), DOT&E is separate from the acquisition
community that conducts developmental and operational testing and
therefore is in a position to provide the Secretary and Congress with an
independent view. Congress created DOT&E in response to reports of
conflicts of interest in the acquisition community’s oversight of
operational testing that led to inadequate testing of operational
suitability^3 and effectiveness^4 and the fielding of new systems that
performed poorly. (DOD’s system acquisition process is described in
app. III.)


^1 The term “operational test and evaluation” means (1) the field test, under realistic conditions, of any
item or key component of a weapon system, equipment, or munition for the purpose of determining
the effectiveness and suitability of the weapon, equipment, or munition for use in combat by typical
military users and (2) the evaluation of the results of the test.

^2 P.L. 98-94 sec. 1211(a)(1), 97 Stat. 684. DOT&E’s legislation is now codified at 10 U.S.C. 139.

^3 DOD defines “operationally suitable” as the degree to which a system can be placed satisfactorily in
field use, with consideration given to such factors as availability, compatibility, transportability,
interoperability, reliability, wartime usage rates, maintainability, safety, and supportability.

^4 DOD defines “operationally effective” as the overall degree of mission accomplishment of a system
when used by representative personnel in the environment planned or expected for operational
employment of the system, considering organization, doctrine, tactics, survivability, vulnerability, and
threat.
By law, DOT&E serves as the principal adviser on operational test and
evaluation in DOD and bears several key responsibilities, including

•  monitoring  and  reviewing  all  operational  test  and  evaluation  in  DOD, 

•  reporting  to  the  Secretary  of  Defense  and  congressional  committees 
whether  the  tests  and  evaluations  of  weapon  systems  were  adequate  and 
whether  the  results  confirmed  that  the  system  is  operationally  suitable  and 
effective  for  combat  before  a  decision  is  made  to  proceed  to  full-rate 
production,  and 

•  submitting  to  the  Secretary  of  Defense  and  congressional  decisionmakers 
an  annual  report  summarizing  operational  test  and  evaluation  activities 
during  the  preceding  fiscal  year. 

In 1993, DOD’s advisory panel on streamlining and codifying acquisition
laws^5 concluded that DOT&E was impeding the goals of acquisition reform
by (1) promoting unnecessary oversight, (2) requiring excessive reporting
detail, (3) inhibiting the services’ discretion in testing, and (4) limiting
participation of system contractors in operational tests where such
involvement is deemed necessary by the services. The following year, DOD
proposed legislative changes that would have reduced the scope and
authority of DOT&E. In testimony, we opposed these changes because they
were directed at perceived rather than documented problems and would
undermine a key management control over the acquisition
process: independent oversight of operational test and evaluation.^6

Although the legislative proposals were not adopted, in 1995 the Secretary
of Defense implemented several operational test and evaluation initiatives
in the Department to (1) involve operational testers earlier in the
acquisition process, (2) use models and simulations effectively,
(3) combine tests where possible, and (4) combine tests and training. The
goals of these initiatives included saving time and money by identifying
and addressing testing issues earlier in the acquisition process; merging or
closely coordinating historically distinct phases, such as developmental
and operational testing, to avoid duplication; and using existing
technologies and training exercises to create realistic and affordable test
conditions.


^5 Established under section 800 of the National Defense Authorization Act for Fiscal Year 1991
(P.L. 101-510, 1990).

^6 Acquisition Reform: Role of Test and Evaluation in System Acquisition Should Not Be Weakened
(GAO/T-NSIAD-94-124, Mar. 22, 1994).
Results in Brief


Our review of 13 case studies indicated that DOT&E oversight of operational
testing and evaluation increased the probability that testing would be
more realistic and more thorough.^7 Specifically, DOT&E was influential in
advocating increasing the reliability of the observed performance and
reducing the risk of unknowns through more thorough testing; conducting
more realistic testing; enhancing data collection and analysis; reporting
independent findings; and recommending follow-on operational test and
evaluation when suitability or effectiveness was not fully demonstrated
prior to initiating full-rate production.

The independence of DOT&E, and its resulting authority to report directly
to Congress, is the foundation of its effectiveness. That independence,
along with its legislative mandate, provides sufficient freedom and
authority to exercise effective oversight of the operational testing and
evaluation of new systems before a decision is made to begin full-rate
production. In the conduct of its oversight, DOT&E (1) executes its approval
authority over test and evaluation master plans and operational test plans
and (2) provides independent annual and summary reports on the test and
evaluation of individual weapon systems to the Secretary of Defense and
Congress.

DOT&E can reduce the risk that systems are not adequately tested prior to
the full-rate production decision. But DOT&E cannot ensure that (1) only
systems whose operational effectiveness and suitability have been
demonstrated through operational testing will proceed to the full-rate
production decision or (2) new fielded systems will accomplish their
missions as intended or that the fielded systems are safe, survivable, and
effective. Moreover, service and acquisition officials have argued that
DOT&E does not have the independent authority to require and approve
service-conducted follow-on operational test and evaluation after full-rate
production begins. In addition, the Office is not currently required to
report on whether new systems are both operationally suitable and
effective before they are fielded.

DOT&E management must balance its oversight responsibilities for
operational testing with the broader acquisition priorities of program
managers and service test agencies. Though supportive of the Office’s
mission and independence, program and service representatives
frequently considered the time, expense, and resources expended to
accommodate DOT&E concerns to be ill-advised. Service officials contended
that the additional testing requested by DOT&E was either unnecessary for
determining the operational effectiveness or suitability of a program or
unrealistic in light of the limitations in the services’ testing resources.

^7 Aspects of realism can include (1) equipment and personnel placed under realistic stress and
operational tempo, (2) threat-representative forces, (3) end-to-end testing, (4) realistic combat tactics,
(5) operationally realistic environments and targets, (6) countermeasured environments,
(7) interfacing systems, (8) terrain and environmental conditions, and (9) contractor involvement.

DOT&E must manage multiple oversight, advisory, and coordination
responsibilities. Several current trends may challenge DOT&E’s ability to
manage its workload and to influence operational test and
evaluation. These trends include (1) service challenges to DOT&E’s
authority to require and oversee follow-on operational testing and
evaluation, (2) a decline in resources available for oversight, (3) an
expansion of DOT&E involvement in activities other than oversight of major
acquisition programs, (4) participation of DOT&E in the acquisition process
as a member of working-level integrated product teams, and (5) greater
integration of developmental and operational testing. These trends make it
imperative that DOT&E prioritize its workload to achieve a balance between
the oversight of major defense acquisition programs and other initiatives
important to the quality of operational test and evaluation.


DOT&E Advocates More Thorough Testing Than the Services


A frequent complaint among representatives of the services’ operational
testing agencies was that DOT&E demanded more tests than were
proposed by the operational test agencies in draft master plans or test
plans. Operational test agency representatives contended that the
additional testing was either unnecessary for determining the operational
effectiveness or suitability of a program or unrealistic in light of the
limitations in the services’ testing resources. However, our review
indicated that DOT&E urged more testing to reduce the level of risk and
number of unknowns prior to the decision to begin full production, while
program and service officials typically sought less testing and were willing
to accept greater risk when making production decisions. The additional
testing DOT&E advocated, often over the objections of service testers,
served to meet the underlying objective of operational testing: to reduce
the uncertainty and risk that systems entering full-rate production would
not fulfill their requirements.

The impact of DOT&E oversight varies with the system under development.
Table 1 summarizes the types of impacts that DOT&E advocated or
facilitated in operational testing among the 13 cases we studied. While the
impacts vary, one consistent pattern in our case studies was a reduction in
uncertainty regarding the weapon systems’ suitability or effectiveness
prior to the full-rate production decision. Each of the impacts is
discussed in more detail in tables 2-6 and in subsequent sections.


Table  1:  Types  of  Impacts  on  the  Operational  Testing  of  13  Systems  Due  to  DOT&E  Oversight 

System

More testing advocated and conducted

More realism included in test design

Enhancements made in data collection or analysis

DOT&E's conclusion deviated from the service's

Follow-on operational test and evaluation advocated and planned or conducted

AH-64D Longbow Apache helicopter

X 

X 

X 

X 

X 

ASPJ^a jammer

X 

X 

C-17A  aircraft 

X 

X 

E-3 AWACS^b (RSIP^c)

X 

X 

F-22  fighter 

X 

X 

X 

Javelin  missile 

X 

X 

X 

Joint STARS^d

X 

X 

X 

X 

LPD-17  assault  ship 

X 

X 

M1A2  tank 

X 

X 

X 

Sensor  fuzed  weapon 

X 

X 

X 

X 

Standard  missile 

X 

X 

Tomahawk Weapon System

X 

X 

V-22  aircraft 

X 

Note:  The  absence  of  an  "X"  does  not  necessarily  indicate  the  absence  of  DOT&E  impact.  For 
example,  blanks  may  occur  where  DOT&E  and  the  service  agreed  on  issues;  however,  the 
deterrent  effect  of  DOT&E  oversight  is  unquantifiable.  In  addition,  blanks  may  occur  because  the 
system  has  not  yet  progressed  through  the  entire  acquisition  process. 

^a Airborne Self-Protection Jammer.

^b Airborne Warning and Control System.

^c Radar System Improvement Program.

^d Surveillance Target Attack Radar System.


DOT&E Oversight Led to More Testing Than Proposed by the Operational Test Agencies


Two of DOT&E’s typical concerns in reviewing service test plans are that the
proposed test methodologies enable (1) comparisons of a system’s
effectiveness through side-by-side testing between the existing and
modified systems and (2) assessments of a system’s reliability through a
sufficient number of test repetitions. Table 2 illustrates examples of cases
where additional testing was conducted at DOT&E’s insistence or with
DOT&E’s support to alleviate these and other types of effectiveness and
suitability concerns.

Table  2:  Examples  of  Programs  That  Expanded  Testing  Due  to  DOT&E  Oversight 

AH-64D Longbow Apache helicopter
Expanded testing: DOT&E insisted that the Army include a baseline
AH-64A company in gunnery and force-on-force exercises to ensure direct
comparability with the Longbow.
Impact: Testers were able to demonstrate the gunnery performance
improvements of the AH-64D. These improvements included that (1) the
AH-64D had 300 instances of lethality compared to 75 for the AH-64A,
(2) the AH-64D was approximately 8 times more survivable than the
AH-64A, and (3) the AH-64D had zero fratricide instances compared to
34 for the AH-64A.

ASPJ jammer
Expanded testing: In follow-on operational test and evaluation of the
F-14D begun in 1995, DOT&E insisted that the scope of the test plan
address the ASPJ's contribution to the aircraft's survivability, not
merely the jammer's compatibility with the aircraft's avionics. This
expansion of the scope necessitated an additional 18 open air flight
tests to measure the ASPJ's effectiveness against air-to-air threats and
a requirement to gather suitability data pertaining to ASPJ, including
its built-in test equipment.^a
Impact: The revised test plan enabled testers to address the critical
operating issue: that the F-14D is more survivable with the ASPJ as part
of its electronic warfare suite than without it.

C-17 aircraft
Expanded testing: The ability to safely perform a mass personnel airdrop
while flying in close formation is a key Air Force capability needed to
conduct a strategic brigade airdrop. DOT&E insisted that an airdrop of a
brigade slice of personnel and equipment be done. The Air Force's
position was that the airdrop was unnecessary before the full-rate
production decision and that the use of the aircraft in airdrops would
be determined after the full-rate production decision.
Impact: DOT&E forced testing that confirmed operational limitations, and
the Army has yet to approve mass airdrops of personnel from C-17s flying
in close formation. Operational tests identified specific problems with
the C-17's airdrop capability: with the air turbulence created in the
wake of the aircraft, flying in close formation can cause the parachutes
dropping from aircraft to oscillate, partially deflate, or collapse.
These conditions could result in serious injury or death to
paratroopers.

F-22 fighter
Expanded testing: DOT&E and the Air Force agreed to a balanced approach
of open-air testing, full mission simulation, and digital models against
then-current and future threats in an overall F-22 and F-15
effectiveness analysis.
Impact: The use of multiple testing and evaluation techniques will
reduce uncertainty in system effectiveness more than the Air Force's
initial preference to use test results to support evaluation by
modeling.

Javelin missile
Expanded testing: DOT&E insisted that the system undergo additional
operational testing prior to the full-rate production decision in 1997
because over 50 design changes had been made to the system since initial
operational test and evaluation in 1993. The Army claimed that
successful passage of technical tests was adequate assurance of
suitability for combat and did not originally intend to conduct
operational tests until 1998, over a year after the start of full-rate
production.^b
Impact: The test provided additional confidence that the weapon system's
modifications had not affected Javelin's suitability for combat.
Javelin missile (continued)
Expanded testing: Based on data collected from initial operational
testing, DOT&E disagreed with the Army's conclusion that the Javelin was
suitable for combat and supported the Army's operational test agency in
requiring the program manager to conduct an operational test to confirm
the unit's reliability.
Impact: Before the additional test was conducted, the Army modified
components of the command launch unit to increase its reliability. The
subsequent test demonstrated that the modifications were successful. The
test also provided two additional benefits. First, missile failures
during the test led to discovery and correction of a design flaw that
prevented the missiles from leaving the launch tube when the gunner
pulled the trigger. Second, while developing the test plan, DOT&E
discovered that the Army had no Javelin-specific tactical doctrine and
recommended the Army study this deficiency. As a result, the Army
developed operational tactics to guide officers in integrating Javelin
with other antitank systems.

LPD-17 assault ship
Expanded testing: The originally proposed operational test for the
LPD-17 consisted of at-sea steaming and some landing craft air cushion
(LCAC) operations. DOT&E forced the incorporation of full-scale assault
operations with LCACs, aircraft, ground assault equipment, and
personnel.
Impact: The expanded scope of the test plan will more closely encompass
the range of system requirements as well as enhance the realism of the
test scenario.

Sensor fuzed weapon
Expanded testing: DOT&E insisted on a second phase of operational test
and evaluation before the full-rate production decision that the Air
Force did not want to conduct.
Impact: The additional testing of system issues not fully tested in the
first phase (such as additional countermeasures, multiple releases, and
an alternate target formation) reduced uncertainty in system
effectiveness and reliability.

Standard missile SM-2
Expanded testing: DOT&E insisted on and obtained five flight tests of
the User Operational Evaluation System SM-2 block IVA missile, a theater
ballistic missile defense system. The Navy planned only two at-sea
safety flights against nonthreat-representative targets. Some of the new
flight tests will be conducted against threat-representative targets
from the integrated AEGIS system.
Impact: DOT&E's insistence on additional testing has lowered the
technical risk of the program by providing for a series of tests to
establish system level validation. These tests will help to demonstrate
the level of reliability and effectiveness of the SM-2 block IVA
missile.

^a See Electronic Warfare (GAO/NSIAD-96-109R, Mar. 1, 1996).

^b See Army Acquisition: Javelin Is Not Ready for Multiyear Procurement (GAO/NSIAD-96-199,
Sept. 26, 1996).

DOT&E Oversight Led to More Realistic Testing Than Proposed by the Operational Test Agencies


Table 3 illustrates examples where the design or conduct of operational
testing was modified at DOT&E’s insistence or with DOT&E’s support to
increase the realism of test conditions and reduce the uncertainty of
system suitability or effectiveness.
Table 3: Examples of Programs That Conducted More Realistic Testing Due to DOT&E Oversight

AH-64D Longbow Apache helicopter
Enhanced realism in tests: DOT&E required a demanding air defense
network, directly intervening to ensure that a specific threat would be
present in the force-on-force trials.
Impact: The testing revealed operational limitations of the AH-64D
variant without the fire control radar and thereby raised the issue of
the appropriate mix of variants to procure. The AH-64D variant with the
fire control radar was unable to reduce the air defense threat
sufficiently to allow the variant without the fire control radar to move
into battle positions without significant possibility of being engaged
by those air defense units.

E-3 AWACS (RSIP)
Enhanced realism in tests: DOT&E insisted that (1) mission crews
comprise a cross section of typical AWACS aircrew members, (2) RSIP be
employed against an array of actual Soviet and other threats, and
(3) the system be used in eight different terrain combinations in both
the United States and Europe.
Impact: Reduced uncertainty of system effectiveness because (1) AWACS
personnel from the engineering and developmental test sorties were
excluded, resulting in the use of two test crews composed of a typical
ratio of U.S. and Canadian deployment personnel and (2) actual threats
and realistic environments were incorporated.

F-22 fighter
Enhanced realism in tests: DOT&E was instrumental in ensuring that a
full mission simulator was developed for comparison testing using
validated software and hardware, insisting that the functionality and
fidelity of the simulation be validated by open air flight data.
Impact: The credibility of the full mission simulator (used to compare
relative mission effectiveness of the F-15 and F-22) will be enhanced.
Enhanced realism in tests: DOT&E insisted that the test and evaluation
master plan include high tempo demonstrations to test the required
sortie generation rate.
Impact: The confidence level of the model's prediction is enhanced by
introducing surge data from actual operations.

Javelin missile
Enhanced realism in tests: DOT&E required Army troops to carry the
missile a representative distance during missions and prior to actual
firings to ensure that the missile's reliability would not be affected
by field handling.
Impact: The Army found that missiles carried during the test failed to
leave the launch tube because of a faulty design of the external
restraining pin-wiring harness. This finding led the Army to redesign
the assembly, which prevented potential missile malfunctions in combat
situations.

Joint STARS
Enhanced realism in tests: In the development of the test plan, DOT&E
encouraged participation of Air Force and Army testers in training
exercises at the National Training Center as a way to enhance test
realism.
Impact: Deployment of the system to Bosnia precluded testing at the
National Training Center, but the test design precedent was established.
Sensor fuzed weapon
Enhanced realism in tests: During the second phase of initial
operational test and evaluation, DOT&E required an extensive validation
of the infrared signature and the use of countermeasures; insisted on
all-weather and all-altitude testing at numerous test sites; insisted on
realistic and comprehensive countermeasures testing; and ensured
realistic targets were made available for testing.
Impact: The enhanced realism of testing reduced uncertainty of system
effectiveness at low altitudes and confirmed decreased effectiveness as
altitude, dive angle, and time of flight increase.

Standard missile SM-2
Enhanced realism in tests: During the review of the Navy's draft test
and evaluation master plan for the SM-2 block IV, DOT&E identified
inadequacies in aerial target programs and required that
threat-representative targets be available for operational testing.
Impact: The need for realistic aerial targets is a significant issue
cutting across all Navy surface antiair warfare programs such as the
Phalanx Close-In Weapon System and the Rolling Airframe Missile, as well
as the various SM-2 blocks.

Tomahawk Weapon System
Enhanced realism in tests: DOT&E was instrumental in ensuring that only
ship crews were used during the testing of the all-up-rounds^a and the
Tomahawk Weapon Control System. Support personnel conducted testing,
while contract personnel maintained the equipment as they do in actual
operations.
Impact: The use of realistic operators reduced uncertainty in system
reliability and effectiveness.

V-22 aircraft
Enhanced realism in tests: DOT&E has emphasized the effects of the V-22
downwash on personnel and material in the vicinity of the hovering
aircraft and the need to test in more realistic ship and landing zone
environments.
Impact: The test program has been revised to conduct downwash testing in
1997 rather than 1999 to address the concerns of DOT&E and others.

^a Each Tomahawk missile variant is contained within a pressurized canister to form an
all-up-round.

DOT&E Oversight Led to Changes in the Data Analysis Plan


DOT&E can insist on or support changes in data analysis plans that provide
more meaningful analyses for decisionmakers. Table 4 illustrates instances
in which DOT&E altered the proposed data collection or analysis plans to
enhance the reliability or utility of the test data.
Table 4: Examples of Programs in Which Changes Were Made in the Data Analysis Plan Due to DOT&E Oversight

AH-64D Longbow Apache helicopter
Changes in data analysis plan: DOT&E insisted on performance criteria to
assess the superiority of the AH-64D over the AH-64A. The criterion, a
20-percent improvement, had not formally been included in the test and
evaluation master plan. DOT&E required measures that addressed the
number of targets killed and helicopters lost.
Impact: DOT&E input allowed testers to more accurately compare the
AH-64D to the AH-64A in quantifiable categories of lethality,
survivability, and fratricide.

ASPJ jammer
Changes in data analysis plan: DOT&E required the Navy to test the ASPJ
against the type of missile that shot down an F-16 over Bosnia in June
1995.^a
Impact: The Navy determined that the ASPJ was effective against that
threat.
Changes in data analysis plan: DOT&E was instrumental in establishing a
requirement to gather suitability data on its built-in test equipment.
While the contractor reported improvement in previously unreliable
built-in test equipment, DOT&E questioned the data collection and
interpretation.
Impact: Independent oversight of ASPJ's suitability assessment confirmed
ongoing concerns with system reliability.

E-3 AWACS (RSIP)
Changes in data analysis plan: DOT&E insisted that service personnel be
trained to operate contractor data extraction systems, thereby removing
the contractor from the process and ensuring data integrity. DOT&E
reviewed a major radar failure and discovered an error in the technical
path described by the service.
Impact: Reduced uncertainty of system effectiveness because the
contractor was removed from data processing, ensuring test integrity.

Joint STARS
Changes in data analysis plan: DOT&E insisted that the Air Force modify
its original technical requirements to include measures of effectiveness
that directly addressed the missions of surveillance, targeting, and
battle management. DOT&E stressed differentiation between user and
system requirements.
Impact: The change in test measures resulted in test data that were more
operationally relevant to system effectiveness.

LPD-17 assault ship
Changes in data analysis plan: DOT&E insisted on measures of
effectiveness that addressed the movement of men and equipment ashore
rather than the Navy's original requirements that focused on technical
specifications.
Impact: The change in test measures will result in test data that are
more operationally relevant to system effectiveness.
M1A2 tank
Changes in data analysis plan: DOT&E required that the Army use credible
data for the determination of reliability in follow-on operational test
and evaluation. The Army proposed the use of failures and other
secondary measures that would not provide a credible basis for reversing
the results of initial operational test and evaluation. DOT&E insisted
that the operational testing be conducted to compare the M1A2 with the
M1A1. Several improvements in the M1A2 addressed command and control
that could not be directly measured. By conducting several operations
with both tanks, the difference in movements and coordination could be
examined to determine the value of the command and control improvements.
By adding uncertainty to test scenarios, DOT&E gave the Army operational
test agency a means to identify differences between the M1A1 and M1A2
models.
Impact: Reduced uncertainty of improved effectiveness and suitability of
the M1A2 compared with the M1A1.

Tomahawk Weapon System
Changes in data analysis plan: DOT&E was instrumental in ensuring that
the effectiveness of mission planning systems was validated using
high-fidelity models and simulations and that bit-by-bit checks were
conducted to validate the effectiveness of functional operations of the
planning system.
Impact: More rigorous data collection and validation reduced uncertainty
of system effectiveness.

^a See Airborne Self-Protection Jammer (GAO/NSIAD-97-46R, Jan. 29, 1997).

DOT&E  Interpreted  the 
Results  of  Some  Testing 
Less  Favorably  Than  the 
Operational  Test  Agencies 


DOT&E’s independent analysis of service test data may confirm or dispute
the results and conclusions reported by the service. In the cases described
in table 5, DOT&E’s analysis of service operational test and evaluation data
resulted in divergent, often less favorable conclusions than those reached
by the service.

Table 5: Examples of Programs in Which DOT&E and Service Conclusions Differed

System 

Conflicting  test  results 

Impact 

AH-64D Longbow Apache helicopter

DOT&E's  independent  analysis  of  the  test  data 
identified  a  predominant  firing  technique  that  had 
not  previously  been  identified  as  useful.  Though 
the  technique  was  never  anticipated  to  be  used  so 
extensively  and  had  not  been  considered  in  the 
development  of  the  Longbow's  tactics,  techniques, 
and  procedures,  DOT&E  determined  that  over  half 
of  the  operational  test  engagements  were 
conducted  using  this  technique.  Nonetheless,  this 
revelation  was  not  in  the  Army  test  report. 

The  Army  will  conduct  a  series  of  simulations  and 
additional  missile  firings  to  determine  the  factors 
affecting  the  overall  effectiveness  of  the  technique 
and  its  relative  effectiveness  to  the  primary  modes 
of  engagement,  thereby  increasing  certainty  in 
system  effectiveness. 

Javelin  missile 

DOT&E  did  not  use  reliability  data  from  the 
pre-initial  operational  test  and  evaluation  period 
because  the  data  were  not  realistic;  as  a  result, 
DOT&E  found  the  command  launch  unit  failed  to 
meet  its  reliability  criteria,  differing  from  the  Army's 
report. 

The  Army  made  numerous  design  changes  to  the 
launch  unit  and  round  before  the  contractor 
initiated  low-rate  production. 

Joint  STARS 

DOT&E  disagreed  with  the  Air  Force  operational 
test  agency's  positive  assessment  of  the 
operational  suitability  and  effectiveness  of  Joint 
STARS  following  its  deployment  to  Operation  Joint 
Endeavor. DOT&E concluded that Joint STARS met
one of three critical operational effectiveness
issues, with limitations, while the other two
effectiveness issues could not be determined.
Overall, the Air Force's conclusion was "suitable
with deficiencies"; DOT&E's conclusion was "as
tested is unsuitable."^a DOT&E and the Air Force
operational test agency also disagreed on how to
report data when terrain masking occurred.
DOT&E objected to the Air Force's phrasing
"nothing significant to report," when in fact nothing
could be seen.

DOT&E's  Beyond-LRIP  report  indicated  not  only 
the  Joint  STARS'  disappointing  test  results  but  also 
the  need  for  extensive  follow-on  operational  test 
and  evaluation.  Subsequently,  the  Joint  STARS 
acquisition  decision  memorandum  required  that 
the  test  and  evaluation  master  plan  be  updated 
and  that  follow-on  operational  test  and  evaluation 
address  the  deficiencies  identified  in  initial 
operational  test  and  evaluation  by  DOT&E. 

M1A2  tank 

DOT&E evaluated the tank as not operationally
suitable, a finding at odds with that of Army testers.
DOT&E  determined  that  the  tank  was  unreliable 
and  unsafe  due  to  uncommanded  turret 
movements,  hot  surfaces  that  caused  contact 
burns,  and  inadvertent  firing  of  the  .50  caliber 
machine  gun. 

Follow-on  operational  test  and  evaluation  was 
conducted  to  determine  if  the  Army's  design 
changes  had  improved  the  system.  The  suitability 
problems  persisted  and  the  follow-on  operational 
test  and  evaluation  was  suspended.  New  design 
changes  were  made  and  a  second  follow-on 
operational  test  and  evaluation  was  conducted, 
which  determined  that  the  safety  issues  were 
resolved  and  that  the  tank  is  now  operationally 
suitable. 

Sensor  fuzed  weapon 

Based  on  the  results  of  the  first  phase  of 
operational  test  and  evaluation  ending  in  1992,  the 
Air  Force  concluded  that  the  sensor  fuzed  weapon 
was "suitable and effective for combat." In contrast,
DOT&E concluded from the same tests that the
system  was  only  "potentially  operationally  effective 
and  suitable." 

As  a  result  of  the  unresolved  issues  in  1992,  a 
second  phase  of  operational  test  and  evaluation 
was  planned  and  executed,  leading  DOT&E  to 
conclude in 1996 that the system was operationally
suitable and effective when employed at low
altitude using level or shallow angle dive deliveries.

^a See Tactical Intelligence: Joint STARS Full-Rate Production Decision Was
Premature and Risky (GAO/NSIAD-97-68, Apr. 25, 1997).

DOT&E Recommended
Follow-On Operational
Test and Evaluation

When DOT&E concludes that a weapon system has not fully demonstrated
operational suitability or effectiveness, or if new testing issues arise during
initial operational test and evaluation, it may recommend that follow-on
operational test and evaluation be done after the full-rate production
decision. Table 6 identifies follow-on operational test and evaluation that
DOT&E supported.

Table  6:  Examples  of  Programs  in  Which  DOT&E  Called  for  Follow-On  Operational  Test  and  Evaluation 

System 

Advocated  follow-on  operational  test  and 
evaluation 

Impact 

AH-64D Longbow Apache helicopter

DOT&E  sought  follow-on  operational  test  and 
evaluation  to  characterize  the  Hellfire  missile's 
performance when using the lock-on before
launch-inhibit  technique.  This  method  of 
engagement  enables  crews  to  immediately  take 
cover  after  target  detection  and  fire  at  moving 
targets  from  those  covered  locations.  This  method 
was  used  in  over  half  of  the  operational  test 
engagements,  though  it  had  not  been  considered 
sufficiently  significant  to  incorporate  in  the 
Longbow's  tactics,  techniques,  and  procedures. 

The use of this technique was not fully anticipated
prior to initial operational test and evaluation. Its
use provided an unexpected level of survivability
for the AH-64D crews. This technique had been
subjected to little, if any, developmental testing.
Further testing will establish its probability of hit.
The Army operational test agency plans to fire 8 to
10 missiles in August 1998.

C-17A  aircraft 

DOT&E  urged  follow-on  operational  test  and 
evaluation  to  demonstrate  the  system's  ability  to 
meet  operational  readiness  objectives,  including 
combination  and  brigade  airdrops,  and  software 
maturity. 

The  Air  Force  has  undertaken  further  testing  with 
the  Army  to  overcome  system  deficiencies  and 
demonstrate  effectiveness.  The  Army  is  formulating 
a  time  requirement  of  about  30  minutes  for 
completing  a  strategic  airdrop.  The  C-17  currently 
has  a  5.5  minute  aircraft  separation  restriction  that 
essentially  prohibits  formation  flying  and  therefore 
requires  2.5  hours  to  complete  a  strategic  airdrop. 
This  resulted  in  continuing  efforts  to  resolve  these 
operational  limitations. 

F-22  aircraft 

DOT&E insisted that the test and evaluation master
plan  require  follow-on  operational  test  and 
evaluation  on  two  capabilities  that  will  not  be 
released  until  after  initial  operational  test  and 
evaluation:  employment  of  the  Joint  Direct  Attack 
Munition  and  Cruise  Missile  Defense. 

The  commitment  to  test  these  capabilities  is 
formally  acknowledged. 

Joint  STARS 

DOT&E  stated  in  its  Joint  STARS  Beyond-LRIP 
report  that  only  18  of  71  performance  criteria 
tested  were  demonstrated  by  the  system  and  that 
further  testing  was  required  for  the  remaining  53. 

The  Joint  STARS  acquisition  decision 
memorandum  directed  additional  testing  to 
address  suitability  deficiencies  in  logistics  and 
software. 

M1A2 tank

The M1A2, during initial operational test and
evaluation in 1993, failed to meet the combat
mission reliability threshold, encountered an
excessive number of battery failures, consumed
15 percent more fuel, and exhibited uncommanded
main gun/turret movements and inadvertent .50
caliber machine-gun firing that made the tank
unsafe. DOT&E, through a Secretary of Defense
letter accompanying the M1A2 Beyond-LRIP report
to Congress, required follow-on operational test
and evaluation on M1A2 suitability issues when the
Army claimed it was unnecessary.

The  Army  executed  a  program  to  correct  the 
deficiencies  identified  during  initial  operational  test 
and  evaluation  and  conducted  follow-on 
operational  test  and  evaluation  in  1995.  Suitability 
issues,  such  as  uncommanded  turret  movement 
and  power  loss,  were  again  experienced.  The 
follow-on  operational  test  and  evaluation  was  put 
on  hold  until  additional  corrective  actions  could  be 
applied.  Follow-on  operational  test  and  evaluation 
resumed in July 1996. The safety problems were
found  to  have  been  addressed  by  the  design 
changes,  and  there  were  no  observed  instances  of 
the  problems  experienced  during  initial  or 
beginning  follow-on  operational  test  and  evaluation. 

Sensor  fuzed  weapon 

The  test  and  evaluation  master  plan  for  the  second 
phase  of  operational  test  and  evaluation  specified 
a  series  of  follow-on  operational  test  and 
evaluations  that  would  address  how  well  the 
addition  of  the  Wind  Compensated  Munition 
Dispenser  and  the  preplanned  product 
improvements  will  rectify  system  limitations. 

Follow-on  operational  test  and  evaluation  ensures 
further  investigation  of  system  limitations  known  at 
the  time  of  the  full-rate  production  decision. 

Strengths and
Weaknesses of
Current
Organizational
Framework

The existence of a healthy difference of opinion between DOT&E and the
acquisition community is a viable sign of robust oversight. In nearly all of
the cases we reviewed, the services and DOT&E cited at least one testing
controversy. For example, services differ on how they view the
relationship between operational testing and their development of tactics,
techniques, and procedures. In addition, DOT&E’s ability to independently
view the development and testing of new systems across the services
brings value to the context of testing. However, several current trends
have the potential to adversely affect DOT&E’s independence and its ability
to affect operational test and evaluation, including (1) service challenges
to DOT&E’s authority to require and oversee follow-on operational testing
and evaluation, (2) declining resources available for oversight, (3) the
management of limited resources to address competing priorities,
(4) DOT&E’s participation in the acquisition process as a member of the
program manager’s working-level integrated product teams, and
(5) greater integration of developmental and operational testing. DOT&E’s
impact on operational testing is dependent upon its ability to manage
these divergent forces while maintaining its independence.

Independence Is the Key to
DOT&E’s Effectiveness


Although the acquisition community has three central objectives
(performance, cost, and schedule), DOT&E has but one:
operational testing of performance. These distinct priorities lead to testing
disputes. Characteristically, the disputes for each system we reviewed
revolved around questions of how, how much, and when to conduct
operational testing, not whether to conduct operational testing. Conflicts
encompassed issues such as (1) how many and what types of tests to
conduct; (2) when testing should occur; (3) what data to collect, how to
collect it, and how best to analyze it; and (4) what conclusions were
supportable, given the analysis and limitations of the test program. The
foundation of most disputes lay in different notions of the costs and
benefits of testing and the levels of risk that were acceptable when making
full-rate production decisions. DOT&E consistently urged more testing (and
consequently more time, resources, and cost) to reduce the level of risk
and number of unknowns before the decision to proceed to full-rate
production, while the services consistently sought less testing and
accepted more risk when making production decisions. Among our case
studies, these divergent dispositions frequently led to healthy debates
about the optimal test program, and in a small number of cases, the
differences led to contentious working relations.

In reviews of individual weapon systems, we have consistently found that
testing and evaluation is generally viewed by the acquisition community as
a requirement imposed by outsiders rather than a management tool to
identify, evaluate, and reduce risks, and therefore a means to more
successful programs. Developers are frustrated by the delays and expense
imposed on their programs by what they perceive as overzealous testers.
The program office strives to get the program into production despite
uncertainties that the system will work as promised or intended.
Therefore, reducing troublesome parts of the acquisition process, such as
operational testing, is viewed as a means to reduce the time required to
enter production.

Nonetheless, the commanders and action officers within the service
operational test agencies were nearly unanimous in their support for an
independent test and evaluation office within OSD. For example, the
Commander of the Army’s Operational Test and Evaluation Command
commended the style and orientation of the current DOT&E Director and
affirmed the long-term importance of the office and its independent
reporting responsibilities to Congress. The Commander of the Navy’s
Operational Test and Evaluation Force stated that the independence of
both DOT&E and the operational test agency was an essential element in
achieving their common goal of ensuring that new programs pass
sufficiently rigorous and realistic operational testing prior to fielding. The
Commander of the Air Force’s Operational Test and Evaluation, while
critical of DOT&E oversight of several major weapon systems, said that the
services were well served by DOT&E’s potential to independently report to
Congress. Moreover, nearly all the operational test agency action officers
we interviewed participate in the integrated product teams with the DOT&E
action officers and recognized the value of the Office’s independent
oversight role. The action officers within the service testing organizations
also have a degree of independence that enables them to represent the
future users of systems developed in the acquisition community. These
action officers stated that their ability to voice positions unpopular with
the acquisition community was strengthened when DOT&E separately
supported their views.

In discussions with over three dozen action officers and analysts
responsible for the 13 cases we reviewed, the independence of DOT&E
emerged as the fundamental condition to enable effective and efficient
oversight. The foundation of interagency (i.e., DOT&E and service
operational test agencies) relations is based on the independence of DOT&E,
its legislative mandate, and its independent reporting to Congress. DOT&E is
outside the chain of command of those responsible for developing and
testing new systems. The services need to cooperate with DOT&E primarily
because the Office must approve all test and evaluation master plans and
operational test plans. Moreover, DOT&E independently reports on the
operational suitability and effectiveness at a system’s full-rate production
milestone, a report that is sent separately to Congress.


Unfavorable  Reports  on 
Operational  Testing  Do  Not 
Always  Inhibit  Full-Rate 
Production 


DOT&E’s report on a system’s operational suitability and effectiveness is
only one of several inputs considered before the full-rate production
decision is made. An unfavorable DOT&E report does not necessarily
prevent full-rate production. In each of the cases cited below, an
affirmative full-rate production decision was made despite a DOT&E report
concluding that the system had not demonstrated during operational test
and evaluation that it was both operationally suitable and operationally
effective:


•  Full-rate production of the M1A2 tank was approved despite DOT&E’s
report that found the system unsuitable.

•  Full-rate production of Joint STARS was approved, though the system
demonstrated only limited effectiveness for “operations other than war”
and was found “as tested is unsuitable.” Only 18 of the 71 performance
criteria were met; 53 others required more testing.

•  Full-rate production of the C-17 Airlifter was approved despite a number
of operational test and evaluation deficiencies, including immature
software and failure to meet combination and brigade airdrop objectives.


Services  Contest  DOT&E 
Oversight  of  Follow-on 
Operational  Test  and 
Evaluation 


The services contend that DOT&E does not have authority to insist on, or
independently approve the conduct of, follow-on operational test and
evaluation. However, in several of the systems we reviewed, DOT&E
overcame service opposition and monitored follow-on operational test and
evaluation. It used several means to achieve success, such as
(1) incorporating follow-on operational test and evaluation in test and
evaluation master plans developed and approved prior to the full-rate
production decision milestone; (2) persuading the Secretary of Defense to
specify follow-on operational test and evaluation, and DOT&E’s oversight
role, in the full-rate production acquisition decision memorandum; and
(3) citing policy, based on title 10, that entitles DOT&E to oversee
operational test and evaluation whenever it occurs in the acquisition
process.^8

Nonetheless, DOT&E action officers stated that the services’ acceptance of
DOT&E’s role in follow-on operational test and evaluation varies over time,
by service and acquisition system, and is largely dependent upon the
convictions of executives in both the services and DOT&E. Among the cases
reviewed in this report, the services offered a variety of arguments against
DOT&E’s having a role in follow-on operational test and evaluation. They
specifically asserted the following:

•  DOT&E need not be involved because the scope of follow-on operational
test and evaluation is frequently less encompassing than initial operational
test and evaluation. Follow-on operational test and evaluation has been
characterized as testing by the user to determine the strengths and
weaknesses of the system and to determine ways to compensate for, or
fix, shortcomings observed in initial operational test and evaluation.^9

•  Title 10 provides DOT&E with the authority to monitor and review, but not
necessarily approve, service follow-on operational test and evaluation
plans.^10

•  Follow-on operational test and evaluation is unnecessary when a system is
found to be operationally effective and suitable during initial operational
test and evaluation, even though DOT&E does not concur.^11


^8 In March 1997 DOT&E issued the “Policy on DOT&E Oversight of Systems in Follow-on Operational
Test and Evaluation (FOT&E).” The Director stated that 10 U.S.C. section 139 provides DOT&E with
the authority to oversee follow-on operational test and evaluation. Specifically, DOT&E shall oversee
follow-on operational test and evaluation to (1) refine estimates made during operational test and
evaluation, (2) complete initial operational test and evaluation activity, (3) verify correction of
deficiencies, (4) evaluate significant changes to design or employment, and (5) evaluate the system to
ensure that it continues to meet operational needs and retains effectiveness in a substantially new
environment or against a new threat. The Director elaborated by specifying that normal DOD 5000.2-R
documentation and approval requirements apply.

A clear distinction between DOT&E oversight in follow-on operational test
and evaluation versus initial operational test and evaluation is that DOT&E
is not required to report follow-on operational test and evaluation results
to Congress in the detailed manner of the Beyond-LRIP report. Therefore,
even if follow-on operational test and evaluation is conducted to assess
modifications to correct effectiveness or suitability shortcomings reported
to Congress in the Beyond-LRIP report, there is no requirement that
Congress receive a detailed accounting of the impact of these
modifications.


DOT&E’s  Resources  Are 
Declining 


DOT&E’s primary asset to conduct oversight, its cadre of action
officers, has decreased in size throughout the decade. This creates a
management challenge for the Office because at the same time staff has
decreased, the number of programs overseen by DOT&E has increased. As
illustrated in table 7, authorized staffing declined from 48 in fiscal
year 1990 to 41 in fiscal year 1997, as did funding (in constant dollars)
from $12,725,000 in fiscal year 1990 to $11,437,000 in fiscal year 1997. The
decline in DOT&E funding is consistent with the general decline in DOD
appropriations during this period. However, since fiscal year 1990, while
the authorized staffing to oversee operational test and evaluation has
declined by 14.6 percent, the number of systems on the oversight list has
increased by 17.7 percent.
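
The percentages cited above can be reproduced from the fiscal year 1990 and 1997 figures reported in table 7. The short Python sketch below is illustrative only: the helper name percent_change and the variable names are assumptions introduced here, and the funding percentage is a derived figure that the report itself does not state.

```python
def percent_change(start: float, end: float) -> float:
    """Return the change from start to end as a percentage of start."""
    return (end - start) / start * 100.0


# Fiscal year 1990 and 1997 values as reported in table 7.
staffing_1990, staffing_1997 = 48, 41          # authorized staffing
programs_1990, programs_1997 = 186, 219        # systems on the oversight list
funding_1990, funding_1997 = 12_725, 11_437    # constant dollars, in thousands

staffing_change = percent_change(staffing_1990, staffing_1997)
programs_change = percent_change(programs_1990, programs_1997)
# Derived here for comparison only; not a figure stated in the report.
funding_change = percent_change(funding_1990, funding_1997)

print(f"Authorized staffing: {staffing_change:+.1f} percent")   # -14.6
print(f"Oversight programs:  {programs_change:+.1f} percent")   # +17.7
print(f"Funding:             {funding_change:+.1f} percent")    # -10.1
```

Under these figures, the staffing decline rounds to 14.6 percent and the growth in the oversight list rounds to 17.7 percent, matching the text.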


^9 In the case of Joint STARS, the acquisition decision memorandum required the Air Force and the
Army to update a test and evaluation master plan for OSD approval, but did not require DOT&E
approval. Moreover, the Director of Air Force Test and Evaluation termed post-milestone III testing as
“regression testing” and emphasized that DOT&E had no oversight role.

^10 In two case study systems, the C-17 and Joint STARS, the Air Force provided DOT&E with a copy of
its follow-on operational test and evaluation test plans for review but did not allow sufficient time and
had no expectation that DOT&E would approve the plans prior to the initiation of testing.

^11 The acquisition decision memorandum for the M1A2 tank required the Army to conduct follow-on
operational test and evaluation (with DOT&E oversight) on safety and suitability shortcomings
identified by DOT&E in initial operational test and evaluation, though the Army had already
determined that the system was operationally suitable as tested.

Table 7: DOT&E Staffing and Funding

Dollars in thousands

                                            Fiscal year
                        1990     1991     1992     1993     1994     1995     1996     1997
Funding^a            $12,725  $13,550  $12,836  $12,333  $11,450  $12,501  $12,183  $11,437
Authorized staffing       48       46       44       44       43     43^b       42       41
Oversight programs       186      207      204      191      199      202      219      219

^a Funding for operational test and evaluation program element only; funding provided for the live
fire test and evaluation program element assumed by DOT&E beginning in fiscal year 1995 is not
reflected in the funding data for fiscal years 1995-97.

^b The authorized end strength for DOT&E beginning in fiscal year 1995 increased by four as a result
of the congressionally directed (Federal Acquisition Streamlining Act of 1994, P.L. 103-355) move
of live fire test and evaluation responsibilities to DOT&E. Since these positions are dedicated to
live fire testing and not operational testing, their numbers are not reflected in this table.


DOT&E’s Limited
Resources Must Address
Competing Priorities


With declining resources, DOT&E must manage competing priorities related
to its oversight, advisory, and coordination responsibilities. DOT&E must
balance the continuing need to allocate resources to these different
priorities while not being perceived as having lost any independence.

DOT&E management has flexibility in defining some portion of the scope of
its oversight and has continued to electively oversee a substantial number
of nonmajor defense acquisition programs and assumed a leading role in
advocating an examination of the modernization needs of the test and
evaluation infrastructure.

DOT&E Continues to Oversee a
Substantial Number of
Nonmajor Programs

Between fiscal year 1990 and 1996, the number of nonmajor acquisition
programs overseen annually by DOT&E ranged between 19 and 43. In fiscal
year 1996, when the oversight list reached a peak of 219, 1 of every 8
programs was listed at the discretion of DOT&E. Thus, during this period
when the resources to oversee operational testing declined and acquisition
reforms placed additional burdens on oversight staff, the directors of
DOT&E continued to place extra responsibility on their staff by augmenting
the required oversight of major acquisition programs with a substantial
number of optional systems.

Despite a relative decline in resources for oversight, DOT&E management
has also elected to assume “a larger role in test resource management
planning and leadership in an attempt to achieve much-needed resource
modernization.”^12 Although the Director is designated as the principal
adviser to the Secretary of Defense and the Under Secretary of Defense for
Acquisition and Technology on operational test and evaluation, including
operational test facilities and equipment,^13 assuming the larger role
defined by DOT&E may be at the expense of its testing oversight mission
and perception of independence. The DOT&E Director is now an adviser to
the Central Test and Evaluation Investment Program and previously
served as Chairman of the Test and Evaluation Committee. The Committee
is responsible for the investment program and presides over the planning,
programming, and budgeting for development and operational test
resources. When the Director served as chairman, we questioned whether
these ties created the perception that the Director was not independent
from developmental testing.^14 This issue may resurface as DOT&E seeks a
larger role in test resource management planning. Also, as the emphasis,
cost, and time for operational test and evaluation are increasingly
questioned in the drive to streamline acquisition, and as oversight assets
are stretched, new DOT&E initiatives may stress the Office’s capacity to
manage oversight effectively.

^12 Director, Operational Test and Evaluation, FTQS Report, March 1996, p. 1-6.


DOT&E Participation in
Working-Level Integrated
Product Teams Has the
Potential to Complicate
Independence

In  May  1995,  the  Secretary  of  Defense  directed  dod  to  apply  the  integrated 
product  and  process  development  concept — using  integrated  product 
teams — throughout  the  acquisition  process.  The  revised  dod  acquisition 
regulations  (dod  5000.2-R  March  1996)  also  addressed  the  use  of 
empowered  integrated  product  teams  at  the  program  office  level,  dot&e 
action  officers  participate  as  members  of  the  working-level  integrated 
product  teams,  and  the  dot&e  Director  is  a  member  of  the  overarching 
team.  One  objective  of  integrated  product  teams,  and  dot&e  participation 
in  particular,  is  to  expedite  the  approval  process  of  test  documents  by 
reaching  agreement  on  the  strategy  and  plan  through  the  identification  and 
resolution  of  issues  early,  understanding  the  issues,  and  documenting  a 
quality  test  and  evaluation  master  plan  that  is  acceptable  to  aU 
organizational  levels  the  first  time.  Integrated  product  teams  are  designed 
to  replace  a  previously  sequential  test  and  evaluation  master  plan 
development  and  approval  process  and  therefore  enhance  timeliness. 
While  this  management  tool  could  increase  communication  between 


^^10  U.S.C.  section  139  assigns  six  responsibilities  to  the  Director,  the  fifth  of  which  is  to  “review  and 
make  recommendations  to  the  Secretary  of  Defense  on  all  budgetary  and  financial  matters  relating  to 
operational  test  and  evaluation,  including  operational  test  facilities  and  equipment  (emphasis  added),
in  the  Department  of  Defense.”

In  Test  and  Evaluation:  The  Director,  Operational  Test  and  Evaluation's  Role  in  Test  Resources
(GAO/NSIAD-90-128,  Aug.  27,  1990),  we  found  that  the  Director’s  independence  was  jeopardized 
because  the  Director  had  influence  over  the  types  of  development  test  assets  used  by  the  services. 
Responsibility  for  developmental  test  resources  rests  with  the  services.  In  1987  Congress  amended 
DOT&E’s  statute  to  emphasize  the  separation  of  operational  testing  from  functions  associated  with 
developmental  testing  by  stating  that  “the  Director  may  not  be  assigned  any  responsibility  for 
developmental  test  and  evaluation,  other  than  the  provision  of  advice  to  officials  responsible  for  such 
testing.” 
testers  and  the  program  managers,  it  also  poses  a  challenge  to  dot&e
independence.  The  challenge  was  recognized  by  the  Department  of 
Defense  Inspector  General  (dod  ig)  when  after  reviewing  the  conduct  of 
operational  testing  it  subsequently  recommended  that  “to  meet  the  intent 
of  10  U.S.C.  139,  DOT&E  should  be  a  nonvoting  member  [of  the 
working-level  integrated  product  team]  so  as  to  maintain  his 
independence.”*®  (emphasis  added)  Though  integrated  product  teams  were 
not  used  throughout  the  entire  time  period  covered  by  this  report,  several 
action  officers  noted  that  this  management  tool  created  threats  to  their 
effectiveness  other  than  having  their  positions  out-voted.  One  dot&e  action 
officer  reported  having  the  lone  dissenting  opinion  in  a  meeting  of  30 
participants  seeking  to  reach  consensus  and  resolve  issues  early.  The 
pressure  of  maintaining  independent,  contrary  positions  in  large  working 
groups  can  be  a  test.  Several  dot&e  representatives  also  noted  that  the 
frequency  of  integrated  product  team  meetings  to  cover  the  multiple 
systems  for  which  they  were  responsible  made  it  impossible  for  them  to 
attend  all,  thereby  lessening  the  possibility  that  testing  issues  can  be 
identified  and  resolved  as  early  as  possible. 

Moreover,  program  managers  and  dot&e  pursue  different  objectives 
through  integrated  product  teams.  The  services  and  program  managers 
view  the  teams  as  a  way  to  facilitate  their  program  objectives  for  cost, 
schedule,  and  performance;  dot&e’s  objective  is  oversight  of  performance 
through  operational  testing.  The  program  managers  and  dot&e  share  a 
desire  to  identify  testing  issues  as  early  as  possible.  However,  the  priority
of  the  program  manager  to  resolve  these  issues  as  early  as  possible
through  the  teams  may  conflict  with  dot&e’s  mission.  dot&e  must  remain
flexible  and  react  to  unknowns  as  they  are  disclosed  during  developmental
testing,  operational  assessments,  and  initial  operational  test  and
evaluation.  Thus,  dot&e’s  participation  on  the  teams  is  a  natural  source  of
tension  and  a  potential  impediment  to  the  team’s  decision-making.  The 
challenge  for  dot&e  action  officers  is  to  maintain  an  independent  and 
potentially  contrary  position  in  an  ongoing  working  group  during  the  life 
of  a  program,  which  may  extend  over  several  years. 


Increased  Integration  of 
Developmental  and 
Operational  Testing  May 
Attenuate  Independent 
Oversight 


The  objectives  of  developmental  and  operational  testing  are  distinct. 
Developmental  testing  determines  whether  a  system  meets  its  functional 
requirements  and  contractual  technical  performance  criteria  sufficiently  to 
proceed  with  operational  testing.  Operational  testing  determines  whether 
the  system  meets  the  operational  requirements  and  will  contribute  to 


^^See  Department  of  Defense  Office  of  the  Inspector  General,  Operational  Testing  Performed  on 
Weapons  Systems,  Report  No.  96-107,  May  6, 1996,  p.  11. 
mission  effectiveness  in  relevant  operational  environments  sufficiently  to
justify  proceeding  with  production.  The  integration  of  these  two  disparate 
test  activities  is  proposed  to  save  the  time  and  resources  required  for 
testing  and  evaluation.  The  sentiment  to  more  closely  link  developmental 
and  operational  testing  dates  from  at  least  the  1986  Blue  Ribbon 
Commission  on  Defense  Management  (Packard  Commission),  which 
found  that  “developmental  and  operational  testing  have  been  too  divorced,
the  latter  has  been  undertaken  too  late  in  the  cycle,  and  prototypes  have
been  used  and  tested  far  too  little.”^®  However,  both  we  and  the  dod  ig 
have  found  that  systems  were  regularly  tested  before  they  were  ready  for 
testing.  In  its  1996  report,  the  dod  ig  reported  that  “4  of  15  systems  we 
examined  for  operational  testing  were  not  ready  for  testing.  This  situation 
occurred  because  a  calendar  schedule  rather  than  system  readiness  often
drove  the  start  of  testing.”^^  Similarly,  we  have  observed  numerous 
systems  that  have  been  pushed  into  low-rate  initial  production  without 
sufficient  testing  to  demonstrate  that  the  system  will  work  as  promised  or 
intended.  Our  reviews  of  major  system  development  in  recent  years  have 
found  that  because  insufficient  time  was  dedicated  to  initial  testing,
systems  were  produced  that  later  experienced  problems  during 
operational  testing  and  systems  entered  initial  production  despite 
experiencing  problems  during  early  operational  testing.^^ 

In  1996  the  Secretary  of  Defense  also  urged  the  closer  integration  of 
developmental  and  operational  testing,  and  combined  tests  where 
possible,  in  part  to  enhance  the  objectives  of  acquisition  reform. 

Combined  developmental  and  operational  testing  is  only  one  of  many 
sources  of  test  data  that  dot&e  has  used  to  foster  more  timely  and 
thorough  operational  test  and  evaluation.  Other  sources  of  information 
include  contractor  developmental  testing,  builder’s  trials,  component 
testing,  production  lot  testing,  stockpile  reliability  testing,  and  operational
deployments.  While  dot&e  has  some  influence  over  the  quality  of 
operational  testing,  by  independently  reviewing  the  design,  execution, 
analysis,  and  reporting  of  such  tests,  it  has  no  direct  involvement  or 
oversight  of  these  other  sources  of  testing  information.  The  use  of 
alternative  sources  of  test  data  as  substitutes  for  operational  test  and 
evaluation  will  limit  dot&e’s  oversight  mission,  which  was  created  to 
improve  the  conduct  and  quality  of  testing.


Quest  for  Excellence:  Final  Report  by  the  President’s  Blue  Ribbon  Commission  on  Defense 
Management,  June  1986,  p.  xxiii. 

^'^Office  of  the  Inspector  General,  Department  of  Defense,  Operational  Testing  Performed  on  Weapons 
Systems,  Report  No.  96-107,  May  6, 1996,  p.  16. 

^®See  Weapons  Acquisition:  Low-Rate  Initial  Production  Used  to  Buy  Weapon  Systems  Prematurely 
(GAO/NSIAD-95-18,  Nov.  21, 1994).
Conclusions  and
Recommendations 


dot&e’s  challenge  is  to  manage  an  expansion  in  independent  oversight
while  satisfying  the  efficiency  goals  of  acquisition  reform  and  undergoing
the  economic  pressures  of  downsizing.  dot&e  oversight  is  clearly  affecting
the  operational  testing  of  new  defense  systems.  dot&e  actions  (such  as  the
insistence  on  additional  testing,  more  realistic  testing,  more  rigorous  data 
analysis,  and  independent  assessments)  are  resulting  in  more  assurance 
that  new  systems  fielded  to  our  armed  forces  are  safe,  suitable,  and 
effective.  However,  dot&e  is  not,  by  design  or  practice,  the  guarantor  of 
effective  and  suitable  acquisitions.  dot&e  oversight  reduces,  but  does  not
eliminate,  the  risk  that  new  systems  will  not  be  operationally  effective  and 
suitable.  Affirmative  full-rate  production  decisions  are  made  for  systems 
that  have  yet  to  demonstrate  their  operational  effectiveness  or  suitability. 
Moreover,  the  services  question  dot&e’s  authority  regarding  follow-on  test 
and  evaluation  of  subsequent  corrective  actions  by  the  program  office. 

We  recommend  that  the  Secretary  of  Defense  revise  dod’s  operational  test 
and  evaluation  policies  in  the  following  ways: 

Require  the  Under  Secretary  of  Defense  for  Acquisition  and  Technology,  in 
those  cases  where  affirmative  full-rate  production  decisions  are  made  for
major  systems  that  have  yet  to  demonstrate  their  operational  effectiveness 
or  suitability,  to  (1)  take  corrective  actions  to  eliminate  deficiencies  in 
effectiveness  or  suitability  and  (2)  conduct  follow-on  test  and  evaluation
of  corrective  actions  until  the  systems  are  determined  to  be  operationally 
effective  and  suitable  by  the  Director,  Operational  Test  and  Evaluation. 
Require  the  Director,  Operational  Test  and  Evaluation,  to  (1)  review  and 
approve  follow-on  test  and  evaluation  master  plans  and  specific 
operational  test  plans  for  major  systems  before  operational  testing  related 
to  suitability  and  effectiveness  issues  left  unresolved  at  the  full-rate
production  decision  and  (2)  upon  the  completion  of  follow-on  operational 
test  and  evaluation,  report  to  Congress,  the  Secretary  of  Defense,  and  the 
Under  Secretary  of  Defense  for  Acquisition  and  Technology  whether  the 
testing  was  adequate  and  whether  the  results  confirmed  the  system  is 
operationally  suitable  and  effective. 

Further,  in  light  of  increasing  operational  testing  oversight  commitments
and  to  accommodate  oversight  of  follow-on  operational  testing  and 
evaluation,  we  recommend  that  the  Director,  Operational  Test  and 
Evaluation,  prioritize  his  Office’s  workload  to  ensure  sufficient  attention  is 
given  to  major  defense  acquisition  programs. 
Agency  Comments
and  Our  Evaluation 


In  commenting  on  a  draft  of  this  report,  dod  concurred  with  our  first  and 
third  recommendations  and  partially  concurred  with  our  second 
recommendation.  Concerning  the  recommendation  with  which  it  partially
concurred,  dod  stated  that  system  specific  reports  to  the  Secretary  of 
Defense  and  Congress  are  not  warranted  for  every  system  that  requires 
follow-on  operational  test  and  evaluation.  dod  pointed  out  that  for  specific
programs  designated  for  follow-on  oversight,  test  plans  are  prepared  to 
correct  previously  identified  deficiencies  by  milestone  III,  and  dot&e 
includes  the  results  of  follow-on  testing  in  its  next  annual  report. 

We  continue  to  believe  our  recommendation  has  merit.  We  recommended
that  the  Secretary  require  dot&e  approval  of  follow-on  test  and  evaluation 
of  corrective  actions  because  during  our  review  we  found  no  consensus 
within  the  defense  acquisition  community  concerning  dot&e’s  role  in 
follow-on  operational  test  and  evaluation.  In  its  comments  dod  did  not 
indicate  whether  it  intended  to  give  dot&e  a  role  in  follow-on  operational 
test  and  evaluation  that  is  comparable  to  its  role  in  initial  operational  test 
and  evaluation.  Moreover,  we  continue  to  believe  that  if  a  major  system 
goes  into  full-rate  production  (even  though  it  was  deemed  by  dot&e  not  to 
be  operationally  suitable  and  effective)  based  on  the  premise  that 
corrections  will  be  made  and  some  follow-on  operational  test  and
evaluation  will  be  performed,  dot&e  should  report,  as  promptly  as 
possible,  whether  or  not  the  follow-on  operational  test  and  evaluation 
results  show  that  the  system  in  question  had  improved  sufficiently  to  be 
characterized  as  both  operationally  suitable  and  effective. 

dod’s  comments  are  reprinted  in  their  entirety  in  appendix  IV,  along  with 
our  specific  evaluation.
As  agreed  with  your  offices,  unless  you  publicly  announce  its  contents
earlier,  we  plan  no  further  distribution  of  this  report  until  15  days  after  its 
date  of  issue.  We  will  then  send  copies  to  other  congressional  committees 
and  the  Secretary  of  Defense.  We  will  also  make  copies  available  to  others 
upon  request. 

If  you  have  any  questions  or  would  like  additional  information,  please  do 
not  hesitate  to  call  me  at  (202)  512-3092  or  the  Evaluator-in-Charge,  Jeff
Harris,  at  (202)  512-3583. 


Kwai-Cheung  Chan

Director  of  Special  Studies  and  Evaluation 
Scope  and  Methodology


To  develop  information  for  this  report,  we  selected  a  case  study 
methodology — evaluating  the  conduct  and  practices  of  dot&e  through  an 
analysis  of  13  weapon  systems.  Recognizing  that  many  test  and  evaluation 
issues  are  unique  to  individual  systems,  we  determined  that  a  case  study 
methodology  would  offer  the  greatest  probability  of  illuminating  the 
variety  of  factors  that  impact  the  value  or  effectiveness  of  oversight  at  the 
level  of  the  Office  of  the  Secretary  of  Defense  (osd).  Moreover,  with  nearly 
200  systems  subject  to  review  of  the  Director,  Operational  Test  and 
Evaluation  (dot&e)  at  any  one  time,  we  sought  a  sample  that  would  enable 
us  to  determine  if  the  Office  had  any  impact  as  well  as  the  ability  to 
examine  the  variety  of  programs  overseen.  Therefore,  we  selected  a 
judgmental  sample  of  cases  reflecting  the  breadth  of  program  types.  As 
illustrated  in  table  I.1,^  we  selected  systems  (1)  from  each  of  the  primary
services,  (2)  categorized  as  major  defense  systems,  and  (3)  representing  a 
wide  array  of  acquisition  and  testing  phases — from  early  operational 
assessments  through  and  beyond  the  full-rate  production  decision.  We 
studied  both  new  and  modified  systems. 


^Table  I.1  lists  the  lead  service,  program  size,  and  acquisition  or  testing  phase  for  each  of  the  case
study  systems,  as  well  as  whether  the  program  is  a  development  effort  or  a  modification  of  an  existing
system.


Table  I.1:  Characteristics  of  Weapon  Systems  Used  for  Case  Studies

System                                          Service(s)  Acquisition  Estimated or actual year of                           New or modification
                                                            category^    selected development phase                            of existing system
----------------------------------------------  ----------  -----------  ----------------------------------------------------  -------------------
AH-64D Longbow Apache helicopter                Army        ID           MS III (1995); IOT&E (1995)                           Modification
Airborne Self-Protection Jammer                 Navy        ID           FOT&E (1995-96); Bosnia (1995)                        New
C-17A Airlifter                                 Air Force   ID           FOT&E (1996-98); MS IIIB (1995)                       New
E-3 AWACS Radar System Improvement Program      Air Force   IC           AFSARC III (1997); IOT&E (1995-96)                    Modification
F-22 fighter aircraft                           Air Force   ID           MS III (2003); IOT&E (2002); LRIP (1999)              New
Javelin missile                                 Army        ID           MS III (1997); LUT (1996); UE (1996)                  New
Joint Surveillance Target Attack Radar System
  E-8 aircraft                                  Air Force   ID           FOT&E (1997); MS III (1996); Bosnia (1996)            New
  Common ground station                         Army        ID           MS III (1998); IOT&E (1997-98); Bosnia (1995)         New
LPD-17 Amphibious Assault Ship                  Navy        ID           MS II (1996); EOA-2 (1996); EOA-1 (1994-95)           New
M1A2 tank                                       Army        ID           FOT&E (1995-96); MS III (1994)                        Modification
Sensor Fuzed Weapon                             Air Force   ID           FOT&E-1 (1997-98); MS III (1996); IOT&E-2 (1995-96)   New
Standard Missile SM-2
  Block IIIB version                            Navy        II           FOT&E (1997); MS III (1996); OPEVAL (1996)            Modification
  Block IV version                              Navy        ID           MS III (1997); DT/IOT&E (1994)                        Modification
Tomahawk Weapon System
  Baseline III                                  Navy        IC           MS III (1998); OPEVAL (1998); IOT&E (1997)            Modification
  Baseline IV                                   Navy        IC           MS III (2000); OPEVAL (1999-00); IOT&E (1999)         Modification
V-22                                            Navy        ID           OPEVAL (1999); OT-IIC (1996)                          New

Legend

AFSARC  =  Air  Force  Systems  Acquisition  Review  Council
DT  =  developmental  testing
EOA  =  early  operational  assessment
FOT&E  =  follow-on  operational  test  and  evaluation
IOT&E  =  initial  operational  test  and  evaluation
LRIP  =  low-rate  initial  production
LUT  =  limited  user  test
MS  =  milestone
OA  =  operational  assessment
OPEVAL  =  operational  evaluation
OT  =  operational  testing
UE  =  user  evaluation

^The  Under  Secretary  of  Defense  for  Acquisition  and  Technology  (USD  (A&T))  designates  major
defense  acquisition  programs  as  either  acquisition  category  ID  or  IC.  The  milestone  decision
authority  for  category  ID  programs  is  USD  (A&T).  The  milestone  decision  authority  for  category
IC  programs  is  the  Department  of  Defense  (DOD)  component  head  or,  if  delegated,  the  DOD
component  acquisition  executive.  Category  I  programs  are  major  defense  acquisition  programs
estimated  to  require  more  than  $355  million  (fiscal  year  1996  constant  dollars)  for  expenditures  in
research,  development,  test,  and  evaluation,  or  more  than  $2.135  billion  (fiscal  year  1996
constant  dollars)  for  procurement.  Category  II  programs  are  those  that  do  not  meet  the  criteria  for
category  I  but  do  meet  the  criteria  for  a  major  system.  A  major  system  is  estimated  to  require
more  than  $75  million  in  fiscal  year  1980  constant  dollars  (approximately  $140  million  in  fiscal
year  1996  constant  dollars)  for  expenditures  in  research,  development,  test,  and  evaluation,  or
more  than  $300  million  in  fiscal  year  1980  constant  dollars  (approximately  $645  million  in  fiscal
year  1996  constant  dollars)  for  procurement.
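
The dollar thresholds in the table note can be read as a simple decision rule. The sketch below is illustrative only and is not part of DOD's regulation or of our methodology: the function and constant names are hypothetical, the thresholds are the fiscal year 1996 constant-dollar figures cited in the note ($355 million and $2.135 billion for category I; approximately $140 million and $645 million for a major system), and the split of category I into ID or IC is omitted because it depends on designation by USD (A&T), not on cost.

```python
# Hypothetical names; thresholds are the fiscal year 1996 constant-dollar
# figures cited in the table note, expressed in whole dollars.
CAT_I_RDTE = 355_000_000            # research, development, test, and evaluation
CAT_I_PROCUREMENT = 2_135_000_000
MAJOR_SYSTEM_RDTE = 140_000_000     # approximate FY 1996 equivalent of $75 million (FY 1980)
MAJOR_SYSTEM_PROCUREMENT = 645_000_000  # approximate FY 1996 equivalent of $300 million (FY 1980)


def acquisition_category(rdte: int, procurement: int) -> str:
    """Return "I", "II", or a below-threshold label from estimated costs.

    Both arguments are estimated costs in fiscal year 1996 constant dollars.
    A program qualifies when either estimate is MORE THAN the threshold.
    """
    if rdte > CAT_I_RDTE or procurement > CAT_I_PROCUREMENT:
        return "I"
    if rdte > MAJOR_SYSTEM_RDTE or procurement > MAJOR_SYSTEM_PROCUREMENT:
        return "II"
    return "below major-system threshold"
```

Under these assumptions, a program estimated at $200 million for research, development, test, and evaluation and $500 million for procurement would fall in category II, because it exceeds the major-system threshold for development costs but neither category I threshold.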


DOT&E,  the  service  operational  test  agencies,  and  the  Institute  for  Defense 
Analyses  (ida)  personnel  agreed  that  dot&e  was  influential  in  the  testing 
done  on  these  13  systems.  In  several  cases,  the  participating  agencies 
vehemently  differed  on  the  value  of  dot&e’s  actions;  however,  whether 
DOT&E  had  an  impact  on  testing  (be  it  perceived  as  positive  or  negative)
was  not  in  dispute. 

In  conducting  our  13  case  studies,  we  assessed  the  strengths  and 
weaknesses  of  the  organizational  framework  in  dod  for  operational  testing 
via  test  agency  representatives,  an  assessment  of  the  origins  and
implementation  (exemplified  by  the  13  cases)  of  the  title  10  amendments 
creating  and  empowering  dot&e,  and  a  review  of  the  literature. 

To  compile  case  study  data,  we  interviewed  current  action  officers  in  both 
DOT&E  and  the  appropriate  operational  test  agency  and  reviewed 
documentation  provided  by  the  operational  test  agencies,  dot&e,  and  ida. 
Using  structured  questionnaires,  we  interviewed  12  dot&e  and  27 
operational  test  agency  action  officers  responsible  for  the  13  selected 
systems  as  well  as  managers  and  technical  support  personnel  in  each 
organization.  In  addition,  we  interviewed  the  commanders  of  each  of  the 
service  testing  agencies  and  dot&e.  When  possible,  we  corroborated
information  obtained  from  interviews  with  documentation,  including  test 
and  evaluation  master  plans,  beyond  low-rate  initial  production  reports, 
defense  acquisition  executive  summary  status  reports,  defense  acquisition
memoranda,  and  interagency  correspondence. 

In  Washington,  D.C.,  we  obtained  data  from  or  performed  work  at  the 
Office  of  the  Director  of  Operational  Test  and  Evaluation,  osd;  Deputy
Under  Secretary  of  Defense  for  Acquisition  Reform;  Directorate  of  Navy 
Test  and  Evaluation  and  Technology  Requirements,  Office  of  the  Chief  of 
Naval  Operations;  Test  and  Evaluation  Management  Agency,  Director  of 
Army  Staff;  Air  Force  Test  and  Evaluation  Directorate;  and  the  dod  Office 
of  the  Inspector  General.  We  also  reviewed  data  and  interviewed  officials 
from  the  Army  Operational  Test  and  Evaluation  Command  and  the 
Institute  for  Defense  Analyses,  Alexandria,  Virginia;  the  Navy  Commander, 
Operational  Test  and  Evaluation  Force,  Norfolk,  Virginia;  and  the  Air 
Force  Operational  Test  and  Evaluation  Command,  Kirtland  Air  Force 
Base,  New  Mexico. 

The  use  of  a  systematic  case  study  framework  enabled  us  to  identify  and 
categorize  the  types  of  impacts  attributable  to  dot&e  among  the  systems 
studied.  In  addition,  this  framework  enabled  us  to  identify  trends  among 
factors  that  correlate  with  dot&e  effectiveness.  However,  we  were  unable 
to  generalize  to  all  systems  subject  to  OSD-level  oversight.  In  light  of  this
limitation,  we  included  only  major  (high-cost)  systems  and  systems 
identified  by  dot&e  and  the  lead  operational  test  agency  as  having  been 
affected  by  dot&e  initiatives.  Moreover,  while  our  methodology  and  data 
collection  enabled  us  to  qualitatively  assess  the  impact  of  dot&e,  it  was  not 
sufficiently  rigorous  either  to  evaluate  the  cost-effectiveness  of  dot&e 
actions  or  to  determine  the  deterrent  effects,  if  any,  the  Office  exerts  over 
the  acquisition  and  testing  process.  Finally,  our  methodology  did  not 
enable  an  assessment  of  whether  the  additional  testing  requested  by  dot&e 
was  necessary  to  provide  full-rate  production  decisionmakers  the  essential 
information  on  a  system’s  operational  effectiveness  and  suitability  or 
whether  the  additional  data  was  worth  the  time,  expense,  and  resources 
necessary  to  obtain  it. 

Our  review  was  performed  from  June  1996  through  March  1997  in 
accordance  with  generally  accepted  government  auditing  standards. 
Description  of  13  Case  Study  Systems

AH-64D  Longbow 
Apache  Helicopter 

The  AH-64D  Longbow  Apache  is  a  remanufactured  and  upgraded  version 
of  the  AH-64A  Apache  helicopter.  This  Army  system  is  equipped  with  a 
mast-mounted  fire  control  radar,  fire-and-forget  radio  frequency  Hellfire 
missile,  and  airframe  improvements  (i.e.,  integrated  cockpit,  improved 
engines,  and  global  positioning  system  navigation). 

Airborne
Self-Protection
Jammer

The  Airborne  Self-Protection  Jammer  is  a  defensive  electronic 
countermeasures  system  using  reprogrammable  deceptive  jamming
techniques  to  protect  tactical  aircraft  from  radar-guided  weapons.  This 
Navy  system  is  intended  to  protect  Navy  and  Marine  Corps  F-18  and  F-14 
aircraft. 

C-17A  Airlifter

The  C-17A  Airlifter  provides  strategic/tactical  transport  of  all  cargo, 
including  outsized  cargo,  mostly  to  main  operational  bases  or  to  small, 
austere  airfields,  if  needed.  Its  four-engine  turbofan  design  enables  the 
transport  of  large  payloads  over  intercontinental  ranges  without  refueling. 
This  Air  Force  aircraft  will  replace  the  retiring  C-141  aircraft  and  augment 
the  C-130  and  C-5  transport  fleets. 

E-3  AWACS  Radar 
System  Improvement 
Program 

The  Air  Force’s  E-3  awacs  consists  of  a  Boeing  707  airframe  modified  to 
carry  a  radome  housing  a  pulse-Doppler  radar  capable  of  detecting  aircraft 
and  cruise  missiles,  particularly  at  low  altitudes.  The  Radar  System 
Improvement  Program  replaces  several  components  of  the  radar  to 
improve  detection  capability  and  electronic  countermeasures  as  well  as
reliability,  availability,  and  maintainability.

F-22  Air  Superiority 
Fighter 

The  F-22  is  an  air  superiority  aircraft  with  a  capability  to  deliver 
air-to-ground  weapons.  The  most  significant  features  include  supercruise, 
the  ability  to  fly  efficiently  at  supersonic  speeds  without  using 
fuel-consuming  afterburners,  low  observability  to  adversary  systems  with 
the  goal  to  locate  and  shoot  down  the  F-22,  and  integrated  avionics  to 
significantly  improve  the  pilot’s  battlefield  awareness. 

Javelin  Missile 

The  Javelin  is  a  man-portable,  antiarmor  weapon  developed  for  the  Army
and  the  Marine  Corps  to  replace  the  aging  Dragon  system.  It  is  designed  as
a  fire-and-forget  system  comprised  of  a  missile  and  reusable  command
launch  unit.
Joint  Surveillance
Target  Attack  Radar 
System 

The  Joint  Surveillance  Target  Attack  Radar  System  is  designed  to  provide 
intelligence  on  moving  and  stationary  targets  to  Air  Force  and  Army 
command  nodes  in  near  real  time.  The  system  comprises  a  modified
Boeing  707  aircraft  frame  equipped  with  radar,  communications
equipment,  and  the  air  component  of  the  data  link,  computer 
workstations,  and  self-defense  suite  as  well  as  ground  station  modules 
mounted  on  Army  vehicles. 

LPD-17  Amphibious 
Assault  Ship 

The  LPD-17  will  be  an  amphibious  assault  ship  capable  of  launching 
(1)  amphibious  assault  craft  from  a  well  deck  and  (2)  helicopters  or 
vertical  takeoff  and  landing  aircraft  from  an  aft  flight  deck.  It  is  intended 
to  transport  and  deploy  combat  and  support  elements  of  Marine 
expeditionary  brigades  as  a  key  component  of  amphibious  task  forces. 

M1A2  Abrams  Main 
Battle  Tank 

The  M1A2  Abrams  main  battle  tank  is  an  upgrade  of  the  M1A1  and  is
intended  to  improve  target  acquisition  and  engagement  rates  and
survivability  while  sustaining  equivalent  operational  suitability.
Specifically,  the  modified  tank  incorporates  a  commander’s  independent
thermal  viewer,  a  position  navigation  system,  and  an  intervehicle 
command  and  control  system. 

Sensor  Fuzed  Weapon 

The  Sensor  Fuzed  Weapon  is  an  antiarmor  cluster  munition  to  be 
employed  by  fighter,  attack,  or  bomber  aircraft  to  achieve  multiple  kills 
per  pass  against  armored  and  support  combat  formations.  Each  munition 
contains  a  tactical  munitions  dispenser  comprising  10  submunitions
containing  a  total  of  40  infrared  sensing  projectiles.  High-altitude  accuracy 
is  to  be  improved  through  the  incorporation  of  a  wind-compensated 
munition  dispenser  upgrade. 

Standard  Missile-2 

The  Standard  Missile-2  is  a  solid  propellant-fueled,  tail-controlled, 
surface-to-air  missile  fired  by  surface  ships.  It  was  originally  designed  to 
counter  high-speed,  high-altitude  antiship  missiles  in  an  advanced 
electronic  countermeasures  environment.  The  block  IIIA  version  provides 
improved  capacity  against  low-altitude  targets  with  an  improved  warhead. 
The  block  IIIB  adds  an  infrared  seeker  to  the  block  IIIA  to  enhance  the
missile’s  capabilities  against  specific  threats.  These  improvements  are 
being  made  to  provide  capability  against  theater  ballistic  missiles  while 
retaining  its  capabilities  against  antiair  warfare  threats. 
Tomahawk  Weapon
System 


The  Tomahawk  Weapon  System  is  a  long-range  subsonic  cruise  missile  for 
land  and  sea  targets.  The  baseline  IV  upgrade  is  fitted  with  a  terminal 
seeker,  video  data  link,  and  two-way  digital  data  link.  The  primary
baseline  IV  configuration  is  the  Tomahawk  multimission  missile;  a  second 
variant  is  the  Tomahawk  hard  target  penetrator. 


V-22  Osprey 


The  V-22  is  a  tilt  rotor  vertical/short  takeoff  and  landing,  multimission
aircraft  developed  to  fulfill  operational  combat  requirements  in  the  Marine 
Corps  and  Special  Operations  Forces. 
DOT&E’s  and  DOD’s  System  Acquisition
Process 


dot&e’s  role  in  the  system  acquisition  process  does  not  become  prominent 
until  the  latter  stages.  As  weapon  system  programs  progress  through
successive  phases  of  the  acquisition  process,  they  are  subject  to  major 
decision  points  called  milestones.  The  milestone  review  process  is 
predicated  on  the  principle  that  systems  advance  to  higher  acquisition 
phases  by  demonstrating  that  they  meet  prescribed  technical  and 
performance  thresholds.  Figure  III.1  illustrates  dod’s  weapon  system
acquisition  process. 


Figure  III.1:  DOD’s  Weapon  System  Acquisition  Process 
Per  DOD  directive,  test  and  evaluation  planning  begins  in  phase  0,  Concept
Exploration.  Operational  testers  are  to  be  involved  early  to  ensure  that  the 
test  program  for  the  most  promising  alternative  can  support  the 
acquisition  strategy  and  to  ensure  the  harmonization  of  objectives, 
thresholds,  and  measures  of  effectiveness  in  the  operational  readiness 
document  and  the  test  and  evaluation  master  plan.  Early  testing  of 
prototypes  in  phase  I,  Program  Definition  and  Risk  Reduction,  and  early 
operational  assessments  are  to  be  emphasized  to  assist  in  identifying  risks. 
A  combined  developmental  and  operational  test  approach  is  encouraged 
to  save  time  and  costs.  Initial  operational  test  and  evaluation  is  to  occur 
during  phase  II  to  evaluate  operational  effectiveness  and  suitability  before 
the  full-rate  production  decision,  milestone  III,  on  all  acquisition  category  I 
and  II  programs.  For  all  acquisition  category  I  programs  and  other
programs  designated  for  osd  test  and  evaluation  oversight,  a  test  and
evaluation  master  plan  is  prepared  and  submitted  for  approval  prior  to 
first  milestone  review  (excluding  milestone  0).^  The  master  plan  is  to  be 
updated  at  milestones  when  the  program  has  changed  significantly.  dot&e
must  approve  the  test  and  evaluation  master  plan  and  the  more  specific 
operational  test  plans  prior  to  their  execution.  This  process  and  the 
required  plan  approvals  provide  dot&e  opportunities  to  affect  the  design 
and  execution  of  operational  testing  throughout  the  acquisition  process. 


^Master  plans  for  acquisition  category  I  programs  are  to  be  submitted  to  the  Director,  Test  Systems 
Engineering  and  Evaluation,  30  days  prior  to  the  first  milestone.  For  all  other  programs  designated  for 
OSD  oversight,  the  plans  must  be  submitted  90  days  prior  to  the  first  milestone. 

Note: GAO comments
supplementing  those  in  the 
report  text  appear  at  the 
end  of  this  appendix. 


See  comment  1. 


See  comment  2. 


See  comment  3. 


OFFICE  OF  THE  SECRETARY  OF  DEFENSE 

WASHINGTON, DC 20301-1700


19 SEP 1997


Mr. Kwai-Cheung Chan

Director,  Special  Studies  and  Evaluation 

National  Security  and  International  Affairs  Division 

U.S.  General  Accounting  Office 

Washington,  DC  20548 

Dear Mr. Chan:

This  is  the  Department  of  Defense  (DoD)  response  to  the  General  Accounting  Office 
(GAO)  draft  report,  “TEST  AND  EVALUATION:  Impact  of  DoD’s  Office  of  the  Director  of 
Operational  Test  and  Evaluation,”  dated  August  22, 1997,  (GAO  CODE  973444/OSD  Case 
1443).  DoD  partially  concurs  with  the  recommendations  in  the  draft  report. 

The  DoD  is  committed  to  fielding  weapon  systems  that  substantially  improve  the 
warfighters’  capabilities  in  a  timely  and  affordable  manner.  In  the  current  climate  of  improving 
efficiency,  some  departures  from  the  past  ways  of  doing  business  are  warranted.  The  existence 
of  the  DOT&E  organization,  closely  involved  with,  but  independent  of  the  acquisition 
community,  helps  ensure  that  we  can  improve  the  way  we  acquire  effective,  suitable,  and 
survivable new systems. The information gained from well-run, independent OT&E activities is
being  heard  and  considered  as  part  of  the  acquisition  decision-making  process. 

The  discussion  in  the  GAO  Report  on  the  Secretary  of  Defense  initiatives  for  operational 
test  and  evaluation  does  not  do  justice  to  these  important  themes.  These  initiatives,  especially 
the  early  involvement  of  operational  testers  in  acquisition  programs,  are  producing  earlier  insights 
on the performance of military systems. It makes no sense to wait until Milestone III to
discover  problems  which  could  have  been  learned  and  corrected  years  earlier. 

The  GAO  also  discusses  the  small  DOT&E  test  resource  management  planning  and 
leadership  role  in  the  context  that  it  may  detract  from  testing  oversight  missions.  Since  the 
financial  and  human  test  resources  budgeted  by  the  Services  have  been  declining,  the  very  ability 
of  the  Services  to  perform  adequate  operational  testing  is  at  stake.  As  GAO  notes,  DOT&E  is 
responsible  by  statute  to  review  and  make  recommendations  to  the  Secretary  of  Defense  on  all 
budgetary  matters  relating  to  operational  testing.  Therefore  it  is  surprising  that  GAO  considers 
DOT&E's efforts in test resources as potentially impacting the adequacy of oversight rather than
contributing  to  the  quality  of  operational  test  and  evaluation  in  DoD. 


OPERATIONAL  TEST 
AND EVALUATION

Detailed responses to each of the GAO recommendations are contained in the attachment.
DoD appreciates the opportunity to review and comment on the draft report.


Philip  E.  Coyle 
Director 


Attachment: 
As  stated 

Now on p. 22.


See  comment  4. 


Now  on  p.  22. 


GAO  DRAFT  REPORT  DATED  AUGUST  22, 1997 
(GAO  CODE  973444)  OSD  CASE  1443 

“TEST  AND  EVALUATION:  IMPACT  OF  DOD’S  OFFICE  OF  THE  DIRECTOR  OF 
OPERATIONAL TEST AND EVALUATION”


DEPARTMENT  OF  DEFENSE  COMMENTS  ON 
THE  GAO  RECOMMENDATIONS 

RECOMMENDATION  1:  The  GAO  recommended  that  the  Secretary  of  Defense  require  the 
Under  Secretary  of  Defense  for  Acquisition  and  Technology,  in  those  cases  where  affirmative 
full-rate  production  decisions  are  made  for  systems  that  have  yet  to  demonstrate  their 
operational  effectiveness  or  suitability,  to  (1)  take  corrective  actions  to  eliminate  effectiveness 
and  suitability  deficiencies,  and  (2)  conduct  follow-on  test  and  evaluation  of  corrective  actions 
until  the  systems  are  determined  to  be  operationally  effective  and  suitable  by  the  Director, 
Operational Test and Evaluation. (p. 32/GAO Draft Report)

DOD RESPONSE: Concur. In cases where affirmative full-rate production decisions are made
for systems that have yet to demonstrate their operational effectiveness or suitability, the
milestone  decision  authority  may,  under  current  practice,  proceed  with  production  and  fielding  if 
other  conditions  warrant.  Commensurate  with  schedule  and  resource  constraints,  serious 
deficiencies  remaining  at  the  time  of  the  decision  to  proceed  beyond  low-rate  initial  production 
assume  a  high  priority  for  correction  and  re-testing  to  ensure  that  the  user’s  requirements  are  met. 

By  statute  and  DoD  Directive,  the  Director,  Operational  Test  and  Evaluation  (DOT&E),  is  the 
senior  advisor  to  the  Secretary  of  Defense  and  the  Defense  Acquisition  Executive  (DAE)  on 
operational  test  and  evaluation  matters.  Before  the  Milestone  III  decision,  the  program  Test  and 
Evaluation  Master  Plan  (TEMP)  must  be  approved  by  DOT&E.  At  that  time,  the  plans  for 
FOT&E  must  be  adequate  to  ensure  that  corrections  to  previous  deficiencies  are  thoroughly 
tested  and  evaluated. 

RECOMMENDATION  2:  The  GAO  recommended  that  the  Secretary  of  Defense  require  the 
Director,  Operational  Test  and  Evaluation,  to  (1)  review  and  approve  follow-on  test  and 
evaluation  master  plans  and  specific  operational  test  plans  prior  to  the  conduct  of  operational 
testing related to suitability and effectiveness issues left unresolved at the full-rate production
decision, and (2) report to the Secretary of Defense, the Under Secretary of Defense for
Acquisition and Technology, and the Congress upon the completion of follow-on operational test
and evaluation whether the testing was adequate and whether the results confirmed the system is
operationally suitable and effective. (p. 33/GAO Draft Report)

See comment 5.


Now on p. 23.


DOD RESPONSE: Partially concur. Title 10, USC, Section 139, states that the Director shall
monitor  and  review  all  OT&E  within  the  DoD.  As  a  practical  matter,  DOT&E  requires 
TEMP/test  plan  approval,  test  monitoring,  and  results  reporting  for  FOT&E,  as  described  in  the 
Director,  OT&E,  policy  memorandum  of  March  10, 1997,  for  programs  subject  to  oversight. 
DOT&E's normal practice is to include such test results in the next DOT&E Annual Report to
Congress.  DOT&E  already  submits  an  independent  report  to  the  Secretary  of  Defense  and  the 
congressional  committees  following  initial  OT&E,  and  before  the  decision  to  proceed  beyond 
low-rate  initial  production  is  finalized.  DoD  does  not  believe  that  a  system-specific  FOT&E 
report  to  the  Secretary  and  Congress  is  warranted  under  most  circumstances,  given  the  resource 
limitations  present,  and  lacking  extraordinary  requirements  for  such  a  report. 

RECOMMENDATION  3:  The  GAO  recommended,  in  light  of  increasing  operational  testing 
oversight  commitments  and  to  accommodate  oversight  of  follow-on  operational  testing  and 
evaluation  (FOT&E)  that  DOT&E  prioritize  the  office’s  workload  to  ensure  sufficient  attention 
is given to major defense acquisition programs. (p. 33/GAO Draft Report)

DOD  RESPONSE:  Concur.  Acquisition  reform  has  caused  a  great  deal  of  change  throughout  the 
range  of  acquisition-related  activities,  including  test  and  evaluation.  Much  of  this  change  has 
increased the workload of DOT&E staff, but our first priority will continue to be to fulfill
statutory  requirements,  including  the  oversight  of  major  defense  acquisition  programs’  OT&E 
and  live  fire  T&E.  The  prioritization  of  OT&E  activities  is  an  ongoing  and  continuous  process 
by  the  Director,  his  Deputies,  and  their  Action  Officers.  The  participation  by  DOT&E  Action 
Officers  in  integrated  product/process  team  activities  prevalent  in  acquisition  programs,  while 
demanding  of  time,  results  in  efficient  communication  among  diverse  organizations  involved  with 
test  planning,  conduct,  and  evaluation.  DoD  believes  that  through  this  improved  communication, 
better-integrated  and  more  efficient  test  programs  will  result  at  lower  cost,  and  less  time. 
Recognizing  the  demands  of  acquisition  reform  and  the  results  of  the  Quadrennial  Defense 
Review,  DOT&E  has  requested  resources  to  match  these  responsibilities. 

The following are GAO's comments on the September 19, 1997, letter from
the  Department  of  Defense. 


GAO  Comments 


1. In prior reviews of individual weapon systems, we have found that
operational testing and evaluation is generally viewed by the acquisition
community as a costly and time-consuming requirement imposed by
outsiders rather than a management tool for more successful programs.
Efforts to enhance the efficiency of acquisition in general, and in
operational testing in particular, need to be well balanced with the
requirement to realistically and thoroughly test operational suitability and
effectiveness prior to the full-rate production decision. We attempted to
take a broader view of acquisition reform efficiency initiatives to
anticipate how these departures from past ways of doing business could
impact both the quality of operational testing and the independence of
DOT&E.

2. We were asked to assess the impact of DOT&E on the quality and impact
of testing and reported on the Secretary of Defense initiatives only to the
extent they may pose a potential impact on DOT&E's independence or
effectiveness. Moreover, we did not recommend or suggest that testers
wait until milestone III to discover problems that could have been learned
and corrected earlier. Since its inception, DOT&E has been active in test
integration and planning working groups and test and evaluation master
plan development during the earliest phases of the acquisition process. In
fact, we have long advocated more early testing to demonstrate positive
system performance prior to the low-rate initial production decision.
DOT&E's early involvement in test planning is appropriate, necessary, and
required by DOD regulations. In this report we do not advocate the
elimination of DOT&E participation during the early stages of the acquisition
process; rather, we merely observe that DOT&E participation through the
vehicle of working-level program manager integrated product teams has
the potential to complicate independence and may be increasingly difficult
to implement with declining resources and increasing oversight
responsibilities following milestone III.

3. We did not recommend or suggest that DOT&E ignore its statutory
responsibility to review and make recommendations to the Secretary of
Defense on budgetary and financial matters related to operational test
facilities and equipment. We only observed that in an era of declining
resources, earlier participation, and extended oversight responsibilities, a
decision to assume a larger role in test resource management planning and
leadership is likely to result in tradeoffs in other responsibilities, the
largest being oversight.

4. We made this recommendation because DOT&E, the services, and the
program offices did not necessarily agree on the degree to which system
performance requirements have been met in initial operational test and
evaluation. Furthermore, there was no consensus within the acquisition
community concerning DOT&E's authority to oversee follow-on operational
test and evaluation conducted to ensure that proposed corrections to
previously identified deficiencies were thoroughly tested and evaluated.

5. Under 10 U.S.C. 2399, DOT&E is required to independently report to
Congress whether a major acquisition system has proven to be
operationally suitable and effective prior to the full-rate production
decision. When follow-on operational test and evaluation is necessary to
test measures intended to correct deficiencies identified in initial
operational test and evaluation, Congress does not receive an equivalent
independent report from DOT&E that concludes, based on required
follow-on operational test and evaluation, whether or not a major system
has improved sufficiently to be considered both operationally suitable and
effective.

Appendix IV

Comments  From  the  Department  of  Defense 

Related GAO Products


Tactical Intelligence: Joint STARS Full-Rate Production Decision Was
Premature and Risky (GAO/NSIAD-97-68, Apr. 25, 1997).

Weapons Acquisition: Better Use of Limited DOD Acquisition Funding
Would Reduce Costs (GAO/NSIAD-97-23, Feb. 13, 1997).

Airborne Self-Protection Jammer (GAO/NSIAD-97-46R, Jan. 29, 1997).

Army Acquisition: Javelin Is Not Ready for Multiyear Procurement
(GAO/NSIAD-96-109, Sept. 26, 1996).

Tactical Intelligence: Accelerated Joint STARS Ground Station Acquisition
Strategy Is Risky (GAO/NSIAD-96-71, May 23, 1996).

Electronic Warfare (GAO/NSIAD-96-109R, Mar. 1, 1996).

Longbow Apache Helicopter: System Procurement Issues Need to Be
Resolved (GAO/NSIAD-95-159, Aug. 24, 1995).

Electronic Warfare: Most Air Force ALQ-135 Jammers Procured Without
Operational Testing (GAO/NSIAD-95-47, Nov. 22, 1994).

Weapons Acquisition: Low-Rate Initial Production Used to Buy Weapon
Systems Prematurely (GAO/NSIAD-96-18, Nov. 21, 1994).

Acquisition Reform: Role of Test and Evaluation in System Acquisition
Should Not Be Weakened (GAO/T-NSIAD-94-124, Mar. 22, 1994).

Test and Evaluation: The Director, Operational Test and Evaluation's Role
in Test Resources (GAO/NSIAD-90-128, Aug. 27, 1990).

Adequacy of Department of Defense Operational Test and Evaluation
(GAO/T-NSIAD-89-39, June 16, 1989).

Weapons Testing: Quality of DOD Operational Testing and Reporting
(GAO/PEMD-88-32BR, July 26, 1988).


(973444) 